(1) GENERAL INFORMATION:

(i) APPLICANT: Kirschbaum, Bernd 
Berglund, Erick 
Meisterernst, Michael 
Polites, Greg 

(ii) TITLE OF INVENTION: PURIFICATION OF HIGHER ORDER TRANSCRIPTION 
COMPLEXES FROM TRANSGENIC NON-HUMAN ANIMALS 

(iii) NUMBER OF SEQUENCES: 17 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: HELLER, EHRMAN, WHITE & McAULIFFE 

(B) STREET: 1666 K Street, N.W., Suite 300 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20006 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 09/849,243 

(B) FILING DATE: 07-May-2001 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Granados, Patricia D.

(B) REGISTRATION NUMBER: 33,683 

(C) REFERENCE/DOCKET NUMBER: 38005-0148



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (202)912-2000 

(B) TELEFAX: (202)912-2020 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Val
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 2:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1               5



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGAGCAACCG CCTGCTGGGT GC

22



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CCTGTGTTGC CTGCTGGGAC G

21



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGAGACTGAA GTTAGGCCAG C

21



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 76 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..76

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCGGCACCAG GCCGCTGCTG TGATGATGAT GATGATGGCT GCTGCCCATG ACTGCGTAAT

60

GCGGTCATGA CGCTTT

76



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 75 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..75

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAAGGGGGTG GGGGAGGCAA GGGTACATGA GAGCCATTAC GTCGTCTTCC TGAATCCCTT

60

TAGCCGCTTT GCTCG

75



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CCCTATGACG TCCCGGATTA CG

22



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GTGGAGTGGT GCCCGGCAAG GG

22



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1               5                   10                  15

Arg Gly Cys



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1310 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..1310

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCATGGGCTA TCCCTATGAC GTCCCGGATT ACGCAGTCAT GGGCAGCAGC CATCATCATC

60

ATCATCACAG CAGCGGCCTG GTGCCGCGCG GCAGCCATAT GGATCAGAAC AACAGCCTGC

120

CACCTTACGC TCAGGGCTTG GCCTCCCCTC AGGGTGCCAT GACTCCCGGA ATCCCTATCT

180

TTAGTCCAAT GATGCCTTAT GGCACTGGAC TGACCCCACA GCCTATTCAG AACACCAATA

240

GTCTGTCTAT TTTGGAAGAG CAACAAAGGC AGCAGCAGCA ACAACAACAG CAGCAGCAGC

300

AGCAGCAGCA GCAGCAACAG CAACAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC

360

AGCAGCAGCA GCAGCAGCAA CAGGCAGTGG CAGCTGCAGC CGTTCAGCAG TCAACGTCCC

420

AGCAGGCAAC ACAGGGAACC TCAGGCCAGG CACCACAGCT CTTCCACTCA CAGACTCTCA

480

CAACTGCACC CTTGCCGGGC ACCACTCCAC TGTATCCCTC CCCCATGACT CCCATGACCC

540

CCATCACTCC TGCCACGCCA GCTTCGGAGA GTTCTGGGAT TGTACCGCAG CTGCAAAATA

600

TTGTATCCAC AGTGAATCTT GGTTGTAAAC TTGACCTAAA GACCATTGCA CTTCGTGCCC

660

GAAACGCCGA ATATAATCCC AAGCGGTTTG CTGCGGTAAT CATGAGGATA AGAGAGCCAC

720

GAACCACGGC ACTGATTTTC AGTTCTGGGA AAATGGTGTG CACAGGAGCC AAGAGTGAAG

780

AACAGTCCAG ACTGGCAGCA AGAAAATATG CTAGAGTTGT ACAGAAGTTG GGTTTTCCAG

840

CTAAGTTCTT GGACTTCAAG ATTCAGAACA TGGTGGGGAG CTGTGATGTG AAGTTTCCTA

900

TAAGGTTAGA AGGCCTTGTG CTCACCCACC AACAATTTAG TAGTTATGAG CCAGAGTTAT

960

TTCCTGGTTT AATCTACAGA ATGATCAAAC CCAGAATTGT TCTCCTTATT TTTGTTTCTG

1020

GAAAAGTTGT ATTAACAGGT GCTAAAGTCA GAGCAGAAAT TTATGAAGCA TTTGAAAACA

1080

TCTACCCTAT TCTAAAGGGA TTCAGGAAGA CGACGTAATG GCTCTCATGT ACCCTTGCCT

1140

CCCCCACCCC CTTCTTTTTT TTTTTTTAAA CAAATCAGTT TGTTTTGGTA CCTTTAAATG

1200

GTGGTGTTGT GAGAAGATGG ATGTTGAGTT GCAGGGTGTG GCACCAGGTG ATGCCCTTCT

1260

GTAAGTGCCC CTTCCGGCAT CCCGGAATTC CTGCAGCCCA ACGCGGCCGC

1310


(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4286 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..4286

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GAATTCCCCT GCAGGTCACT TAGCGTTGGC CACATAGTAG GTTCTCAAAT ACTTGTTAAT

60

AAATAAGTTT GTTCGAGAAG CTGGGCAATG ATATTCTACA GCTGGAAGAA GAAACATAAT

120

GATCTAGTAA TTAGCTCAAT TAAAAATAAA CGTTCTTCTT TCCTCAGAGG AGCATTTCCC

180

AAGGCCTGCC TTGATAGCCA TCCAAAAAGG CCAAGCTCAT CCAATCTTGC CCTAGATTTA

240

TGCTAAAATG CAGTTACAAT CGATAGGATG ACAGAAAACG ACAGCACTTA TTTAAATATA

300

ATAGGCACTT ATTTAAATAG GAGAAGCTGT GACTTCATAG CAAGTGTTGG GGTTAGGAAA

360

CTGGGTGGAT AAACTTGCTG ATGCTGTAGA TCTTAGCCTC TACATGAGAT CATGTGGAAA

420

ATCTGAAAGC ATTTTAGGTT CCTTATGTTT GCAATCAAAT AACTGTACAC CTTTTAATTT

480

AAAAAGTACC ATGAGGCACA CACACACACT CGCAGGAACT TTTTGGCGTA ACAAAACTAG

540

AATTAGATCT AAAAGCTAAC TGTAGGACTG AGTCTATTCT AAACTGAAAG CCTGGACATC

600

TGGAGTACCA GGGGGAGATG ACGTGTTACG GGCTTCCATA AAAGCAGCTG GCTTTGAATG

660

GAAGGAGCCA AGAGGCCAGC ACAGGAGCGG ATTCGTCGCT TTCACGGCCA TCGAGCCGAA

720

CCTCTCGCAA GTCCGTGAGC CGTTAAGGAG GCCCCCAGTC CCGACCCTTC GCCCCAAGCC

780

CCTCGGGGTC CCCGGGCCTG GTACTCCTTG CCACACGGGA GGGGCGCGGA AGCCGGGGCG

840

GAGGAGGAGC CAACCCCGGG CTGGGCTGAG ACCCGCAGAG GAAGACGCTC TAGGGATTTG

900

TCCCGGACTA GCGAGATGGC AAGGCTGAGG ACGGGAGGCT GATTGAGAGG CGAAGGTACA

960

CCCTAATCTC AATACAACCT TTGGAGCTAA GCCAGCAATG GTAGAGGGAA GATTCTGCAC

1020

GTCCCTTCCA GGCGGCCTCC CCGTCACCAC CCCCCCCAAC CCGCCCCGAC CGGAGCTGAG

1080

AGTAATTCAT ACAAAAGGAC TCGCCCCTGC CTTGGGGAAT CCCAGGGACC GTCGTTAAAC

1140

TCCCACTAAC GTAGAACCCA GAGATCGCTG CGTTCCCGCC CCCTCACCCG CCCGCTCTCG

1200

TCATCACTGA GGTGGAGAAG AGCATGCGTG AGGCTCCGGT GCCCGTCAGT GGGCAGAGCG

1260

CACATCGCCC ACAGTCCCCG AGAAGTTGGG GGGAGGGGTC GGCAATTGAA CCGGTGCCTA

1320

GAGAAGGTGG CGCGGGGTAA ACTGGGAAAG TGATGTCGTG TACTGGCTCC GCCTTTTTCC

1380

CGAGGGTGGG GGAGAACCGT ATATAAGTGC AGTAGTCGCC GTGAACGTTC TTTTTCGCAA

1440

CGGGTTTGCC GCCAGAACAC AGGTAAGTGC CGTGTGTGGT TCCCGCGGGC CTGGCCTCTT

1500

TACGGGTTAT GGCCCTTGCG TGCCTTGAAT TACTTCCACG CCCCTGGCTG CAGTACGTGA

1560

TTCTTGATCC CGAGCTTCGG GTTGGAAGTG GGTGGGAGAG TTCGAGGCCT TGCGCTTAAG

1620

GAGCCCCTTC GCCTCGTGCT TGAGTTGAGG CCTGGCCTGG GCGCTGGGGC CGCCGCGTGC

1680

GAATCTGGTG GCACCTTCGC GCCTGTCTCG CTGCTTTCGA TAAGTCTCTA GCCATTTAAA

1740

ATTTTTGATG ACCTGCTGCG ACGCTTTTTT TCTGGCAAGA TAGTCTTGTA AATGCGGGCC

1800

AAGATCTGCA CACTGGTATT TCGGTTTTTG GGGCCGCGGG CGGCGACGGG GCCCGTGCGT

1860

CCCAGCGCAC ATGTTCGGCG AGGCGGGGCC TGCGAGCGCG GCCACCGAGA ATCGGACGGG

1920

GGTAGTCTCA AGCTGGCCGG CCTGCTCTGG TGCCTGGCCT CGCGCCGCCG TGTATCGCCC

1980

CGCCCTGGGC GGCAAGGCTG GCCCGGTCGG CACCAGTTGC GTGAGCGGAA AGATGGCCGC

2040

TTCCCGGCCC TGCTGCAGGG AGCTCAAAAT GGAGGACGCG GCGCTCGGGA GAGCGGGCGG

2100

GTGAGTCACC CACACAAAGG AAAAGGGCCT TTCCGTCCTC AGCCGTCGCT TCATGTGACT

2160

CCACGGAGTA CCGGGCGCCG TCCAGGCACC TCGATTAGTT CTCGAGCTTT TGGAGTACGT

2220

CGTCTTTAGG TTGGGGGGAG GGGTTTTATG CGATGGAGTT TCCCCACACT GAGTGGGTGG

2280

AGACTGAAGT TAGGCCAGCT TGGCACTTGA TGTAATTCTC CTTGGAATTT GCCCTTTTTG

2340

AGTTTGGATC TTGGTTCATT CTCAAGCCTC AGACAGTGGT TCAAAGTTTT TTTCTTCCAT

2400

TTCAGGTGTC GTGAGGAATT GCCCGGGGGA TCCATGGGCT ATCCCTATGA CGTCCCGGAT

2460

TACGCAGTCA TGGGCAGCAG CCATCATCAT CATCATCACA GCAGCGGCCT GGTGCCGCGC

2520

GGCAGCCATA TGGATCAGAA CAACAGCCTG CCACCTTACG CTCAGGGCTT GGCCTCCCCT

2580

CAGGGTGCCA TGACTCCCGG AATCCCTATC TTTAGTCCAA TGATGCCTTA TGGCACTGGA

2640

CTGACCCCAC AGCCTATTCA GAACACCAAT AGTCTGTCTA TTTTGGAAGA GCAACAAAGG

2700

CAGCAGCAGC AACAACAACA GCAGCAGCAG CAGCAGCAGC AGCAGCAACA GCAACAGCAG

2760

CAGCAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA ACAGGCAGTG

2820

GCAGCTGCAG CCGTTCAGCA GTCAACGTCC CAGCAGGCAA CACAGGGAAC CTCAGGCCAG

2880

GCACCACAGC TCTTCCACTC ACAGACTCTC ACAACTGCAC CCTTGCCGGG CACCACTCCA

2940

CTGTATCCCT CCCCCATGAC TCCCATGACC CCCATCACTC CTGCCACGCC AGCTTCGGAG

3000

AGTTCTGGGA TTGTACCGCA GCTGCAAAAT ATTGTATCCA CAGTGAATCT TGGTTGTAAA

3060

CTTGACCTAA AGACCATTGC ACTTCGTGCC CGAAACGCCG AATATAATCC CAAGCGGTTT

3120

GCTGCGGTAA TCATGAGGAT AAGAGAGCCA CGAACCACGG CACTGATTTT CAGTTCTGGG

3180

AAAATGGTGT GCACAGGAGC CAAGAGTGAA GAACAGTCCA GACTGGCAGC AAGAAAATAT

3240

GCTAGAGTTG TACAGAAGTT GGGTTTTCCA GCTAAGTTCT TGGACTTCAA GATTCAGAAC

3300

ATGGTGGGGA GCTGTGATGT GAAGTTTCCT ATAAGGTTAG AAGGCCTTGT GCTCACCCAC

3360

CAACAATTTA GTAGTTATGA GCCAGAGTTA TTTCCTGGTT TAATCTACAG AATGATCAAA

3420



CCCAGAATTG TTCTCCTTAT TTTTGTTTCT GGAAAAGTTG TATTAACAGG TGCTAAAGTC 

3480 

AGAGCAGAAA TTTATGAAGC ATTTGAAAAC ATCTACCCTA TTCTAAAGGG ATTCAGGAAG 

3540 

ACGACGTAAT GGCTCTCATG TACCCTTGCC TCCCCCACCC CCTTCTTTTT TTTTTTTTAA 

3600 

ACAAATCAGT TTGTTTTGGT ACCTTTAAAT GGTGGTGTTG TGAGAAGATG GATGTTGAGT 

3660 

TGCAGGGTGT GGCACCAGGT GATGCCCTTC TGTAAGTGCC CCTTCCGGCA TCCCGGATAT 
3720 



CCTGCAGCCC AACACGGCCG CTCGAGCATG CATCTAGAGA ACGTCACGGC CGCGATCCCC 

3780 

CTGTGCCTTC TAGTTGCCAG CCATCTGGTT GTTTGCCCCT CCCCCGTGCC TTCCTTGACC 

3840 

CTGGAAGGTG CCACTCCCAC TGTCCTTTCC TAATAAAATG AGGAAATTGC ATCGCATTGT 

3900 

CTGAGTAGGT GTCATTCTAT TCTGGGGGGT GGGGTGGGGC AGGACAGCAA GGGGGAGGAT

3960 

TGGGAAGACA ATAGCAGGCA TGCTGGGGAT GCGGTGGGCT CTATGGGTAC CCAGGTGCTG 

4020 

AAGAATTGAC CCGGTTCCTC CTGGGCCAGA AAGAAGCAGG CACATCCCCT TCTCTGTGAC

4080 

ACACCCTGTC CACGCCCCTG GTTCTTAGTT CCAGCCCCAC TCATAGGACA CTCAACTTGG

4140 

AGCGGTCTCT CCCTCCCTCA TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG 

4200 

AAATTAAAGC AAGAAGGCTA TTAAGTGCAG AGGGAGAGAA AATGCCTCCA ACATGTGAGG 

4260 

AAGTAATGAT AGAAATCATA GAATTC 

4286 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3263 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..3263

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATCGATAAGC TGAGATCCGG CTAGAAACTG CTGAGGGCTG GACCGCATCT GGGGACCATC

60

TGTTCTTGGC CCTGAGCGGG GCAGGAACTG CTTACCGCAG ATATCCTGTT TGCCCCAATT

120

CAGCTGTTCC ATCTGTTCTT GGCCCTGAGC GGGGCAGGAA CTGCTTACCA CAGATATCCT

180

GTTTGGCCCA TATTCAGCTG TCTCTCTGTT CCTGACCTTG ATCTGAACTT CTCTATTCTC

240

AGTTATGTAT TTTTCCCATG CCTTGCAAAA TGGCGTTACT TAAGCTAGCT TGCCAAACCT

300

ACGGCTGGGG TCTTTCACGT TTATATCTAT GAGGGGAAGG ACCCAGAGTG GGGAAGCTGG

360

GATCTTGGGA ACACGCTTCT CTACATGGCA TTGTCTGCAC GGTGGAGTCC GGATCTGAGC

420

TTGGCTTGGT TTTTAAAACC AGCCTGGAGT AGAGCAGATG GGTTAAGGTG AGTGACCCCT

480

CAGCCCTGGA CATTCTTAGA TGAGCCCCCT CAGGAGTAGA GAATAATGTT GAGATGAGTT

540

CTGTTGGCTA AAATAATCAA GGCTAGTCTT TATAAAACTG TCTCCTCTTC TCCTAGCTTC

600

GATCCAGAGA GAGACCTGGG CGGAGCTGGT CGCTGCTCAG GAACTCCAGG AAAGGAGAAG

660

CTGAGGTTAC CACGCTGCGA ATGGGTTTAC GGAGATAGCT GGCTTTCCGG GGTGAGTTCT

720

CGTAAACTCC AGAGCAGCGA TAGGCCGTAA TATCGGGGAA AGCACTATAG GGACATGATG

780

TTCCACACGT CACATGGGTC GTCCTATCCG AGCCAGTCGT GCCAAAGGGG CGGTCCCGCT

840

GTGCACACTG GCGCTCCAGG GAGCTCTGCA CTCCGCCCGA AAAGTGCGCT CGGCTCTGCC

900

AGGACGCGGG GCGCGTGACT ATGCGTGGGC TGGAGCAACC GCCTGCTGGG TGCAAACCCT

960


TTGCGCCCGG ACTCGTCCAA CGACTATAAA GAGGGCAGGC TGTCCTCTAA GCGTCACCAC 

1020 

GACTTCAACG TCCTGAGTAC CTTCTCCTCA CTTACTCCGT AGCTCCAGCT TCACCAGATC 

1080 

CTCGAGAACG TCTCCCATGG GCTATCCCTA TGACGTCCCG GATTACGCAG TCATGGGCAG 

1140 

CAGCCATCAT CATCATCATC ACAGCAGCGG CCTGGTGCCG CGCGGCAGCC ATATGGATCA 

1200 

GAACAACAGC CTGCCACCTT ACGCTCAGGG CTTGGCCTCC CCTCAGGGTG CCATGACTCC 

1260 



CGGAATCCCT ATCTTTAGTC CAATGATGCC TTATGGCACT GGACTGACCC CACAGCCTAT 

1320 

TCAGAACACC AATAGTCTGT CTATTTTGGA AGAGCAACAA AGGCAGCAGC AGCAACAACA 

1380 

ACAGCAGCAG CAGCAGCAGC AGCAGCAGCA ACAGCAACAG CAGCAGCAGC AGCAGCAGCA 

1440 

GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA GCAACAGGCA GTGGCAGCTG CAGCCGTTCA 

1500 

GCAGTCAACG TCCCAGCAGG CAACACAGGG AACCTCAGGC CAGGCACCAC AGCTCTTCCA 

1560 

CTCACAGACT CTCACAACTG CACCCTTGCC GGGCACCACT CCACTGTATC CCTCCCCCAT 

1620 

GACTCCCATG ACCCCCATCA CTCCTGCCAC GCCAGCTTCG GAGAGTTCTG GGATTGTACC 

1680 

GCAGCTGCAA AATATTGTAT CCACAGTGAA TCTTGGTTGT AAACTTGACC TAAAGACCAT

1740 

TGCACTTCGT GCCCGAAACG CCGAATATAA TCCCAAGCGG TTTGCTGCGG TAATCATGAG 

1800 

GATAAGAGAG CCACGAACCA CGGCACTGAT TTTCAGTTCT GGGAAAATGG TGTGCACAGG 

1860 

AGCCAAGAGT GAAGAACAGT CCAGACTGGC AGCAAGAAAA TATGCTAGAG TTGTACAGAA 

1920 

GTTGGGTTTT CCAGCTAAGT TCTTGGACTT CAAGATTCAG AACATGGTGG GGAGCTGTGA 

1980 

TGTGAAGTTT CCTATAAGGT TAGAAGGCCT TGTGCTCACC CACCAACAAT TTAGTAGTTA 

2040 

TGAGCCAGAG TTATTTCCTG GTTTAATCTA CAGAATGATC AAACCCAGAA TTGTTCTCCT 

2100 
TATTTTTGTT TCTGGAAAAG TTGTATTAAC AGGTGCTAAA GTCAGAGCAG AAATTTATGA

2160 

AGCATTTGAA AACATCTACC CTATTCTAAA GGGATTCAGG AAGACGACGT AATGGCTCTC 

2220 

ATGTACCCTT GCCTCCCCCA CCCCCTTCTT TTTTTTTTTT TAAACAAATC AGTTTGTTTT 

2280 

GGTACCTTTA AATGGTGGTG TTGTGAGAAG ATGGATGTTG AGTTGCAGGG TGTGGCACCA 

2340 

GGTGATGCCC TTCTGTAAGT GCCCCTTCCG GCATCCCGGA ATTCCTGCAG CCCAACGCGG 

2400 



CCGCTTCGAG GGATCTTTGT GAAGGAACCT TACTTCTGTG GTGTGACATA ATTGGACAAA 

2460 

CTACCTACAG AGATTTAAAG CTCTAAGGTA AATATAAAAT TTTTAAGTGT ATAATGTGTT 

2520 

AAACTACTGA TTCTAATTGT TTGTGTATTT TAGATTCCAA CCTATGGAAC TGATGAATGG 

2580 

GAGCAGTGGT GGAATGCCTT TAATGAGGAA AACCTGTTTT GCTCAGAAGA AATGCCATCT 

2640 

AGTGATGATG AGGCTACTGC TGACTCTCAA CATTCTACTC CTCCAAAAAA GAAGAGAAAG 

2700 

GTAGAAGACC CCAAGGACTT TCCTTCAGAA TTGCTAAGTT TTTTGAGTCA TGCTGTGTTT 

2760 

AGTAATAGAA CTCTTGCTTG CTTTGCTATT TACACCACAA AGGAAAAAGC TGCACTGCTA

2820 

TACAAGAAAA TTATGGAAAA ATATTCTGTA ACCTTTATAA GTAGGCATAA CAGTTATAAT 

2880 

CATAACATAC TGTTTTTTCT TACTCCACAC AGGCATAGAG TGTCTGCTAT TAATAACTAT 

2940 

GCTCAAAAAT TGTGTACCTT TAGCTTTTTA ATTTGTAAAG GGGTTAATAA GGAATATTTG 

3000 

ATGTATAGTG CCTTGACTAG AGATCATAAT CAGCCATACC ACATTTGTAG AGGTTTTACT

3060 

TGCTTTAAAA AACCTCCCAC ACCTCCCCCT GAACCTGAAA CATAAAATGA ATGCAATTGT 

3120 

TGTTGTTAAC TTGTTTATTG CAGCTTATAA TGGTTACAAA TAAAGCAATA GCATCACAAA 

3180 

TTTCACAAAT AAAGCATTTT TTTCACTGCA TTCTAGTTGT GGTTTGTCCA AACTCATCAA 

3240 

TGTATCTTAT CATGTCTGGA TCC 

3263 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Val Met Gly Ser Ser
1               5                   10                  15

His His His His His His Ser Ser Gly Leu Val Pro Arg Gly Ser His
            20                  25                  30

Met Asp Gln Asn Asn Ser Leu Pro Pro Tyr Ala Gln Gly Leu Ala Ser
        35                  40                  45

Pro Gln Gly Ala Met Thr Pro Gly Ile Pro Ile Phe Ser Pro Met Met
    50                  55                  60

Pro Tyr Gly Thr Gly Leu Thr Pro Gln Pro Ile Gln Asn Thr Asn Ser
65                  70                  75                  80

Leu Ser Ile Leu Glu Glu Gln Gln Arg Gln Gln Gln Gln Gln Gln Gln
                85                  90                  95

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
            100                 105                 110

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Ala
        115                 120                 125

Val Ala Ala Ala Ala Val Gln Gln Ser Thr Ser Gln Gln Ala Thr Gln
    130                 135                 140

Gly Thr Ser Gly Gln Ala Pro Gln Leu Phe His Ser Gln Thr Leu Thr
145                 150                 155                 160

Thr Ala Pro Leu Pro Gly Thr Thr Pro Leu Tyr Pro Ser Pro Met Thr
                165                 170                 175

Pro Met Thr Pro Ile Thr Pro Ala Thr Pro Ala Ser Glu Ser Ser Gly
            180                 185                 190

Ile Val Pro Gln Leu Gln Asn Ile Val Ser Thr Val Asn Leu Gly Cys
        195                 200                 205

Lys Leu Asp Leu Lys Thr Ile Ala Leu Arg Ala Arg Asn Ala Glu Tyr
    210                 215                 220

Asn Pro Lys Arg Phe Ala Ala Val Ile Met Arg Ile Arg Glu Pro Arg
225                 230                 235                 240

Thr Thr Ala Leu Ile Phe Ser Ser Gly Lys Met Val Cys Thr Gly Ala
                245                 250                 255

Lys Ser Glu Glu Gln Ser Arg Leu Ala Ala Arg Lys Tyr Ala Arg Val
            260                 265                 270

Val Gln Lys Leu Gly Phe Pro Ala Lys Phe Leu Asp Phe Lys Ile Gln
        275                 280                 285

Asn Met Val Gly Ser Cys Asp Val Lys Phe Pro Ile Arg Leu Glu Gly
    290                 295                 300

Leu Val Leu Thr His Gln Gln Phe Ser Ser Tyr Glu Pro Glu Leu Phe
305                 310                 315                 320

Pro Gly Leu Ile Tyr Arg Met Ile Lys Pro Arg Ile Val Leu Leu Ile
                325                 330                 335

Phe Val Ser Gly Lys Val Val Leu Thr Gly Ala Lys Val Arg Ala Glu
            340                 345                 350

Ile Tyr Glu Ala Phe Glu Asn Ile Tyr Pro Ile Leu Lys Gly Phe Arg
        355                 360                 365

Lys Thr Thr
    370



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1               5                   10                  15

Arg Gly